272 research outputs found

    From media crossing to media mining

    This paper reviews how the concept of media crossing has contributed to the advancement of information access as an application domain and explores directions for a future research agenda. These directions include themes that could help to broaden the scope and to incorporate media crossing into a more general approach that not only combines medium-specific processing but also exploits more abstract, medium-independent representations, partly based on the foundational work on statistical language models for information retrieval. Three examples of successful applications of media crossing are presented, with a focus on the aspects that could be considered a first step towards a generalized form of media mining.

    Distributed Access to Oral History collections: Fitting Access Technology to the needs of Collection Owners and Researchers

    In contrast with the large amounts of potentially interesting research material in digital multimedia repositories, the opportunities to unveil the gems therein are still very limited. The Oral History project ‘Verteld Verleden’ (a Dutch rendering of ‘Oral History’), currently running in The Netherlands, focuses on improving access to spoken testimonies in collections spread over many Dutch cultural heritage institutions by deploying modern technology for both infrastructure and access. A key objective of the project is to map the specific requirements of collection owners and researchers regarding both publishing and access against the current state of the technology. To demonstrate the potential, Verteld Verleden is developing an Oral History portal that provides access to distributed collections. At the same time, practical step-by-step plans are provided for getting started with modern access technologies. In this way, a solid starting point for sustained access to Oral History collections can be established.

    Improved Cyberbullying Detection Through Personal Profiles

    Online social networks brought a new definition to relationships and communications. One may have hundreds of friends in cyberspace without ever having seen their real faces. Along with this transition there is increasing evidence that bullying has transformed as well, moving from school yards to internet precincts: cyberbullying. Although bullying draws a lot of attention, cyberbullying, due to its technical aspects, is not yet fully understood. State-of-the-art studies in cyberbullying detection have mainly focused on the sentiment of terms and the content of conversations, while largely ignoring the actors involved and their interactions. A funny chat between teenage friends can be flagged as bullying merely because it contains foul words, while a tenacious intruder with subtle but hurtful comments slips through. We hypothesize that incorporating the potential victim’s profile and characteristics into cyberbullying detection improves the discrimination capacity of the procedure. This study outlines a framework for this faceted approach. Our study demonstrated that deploying gender-specific and age-specific features improves cyberbullying detection accuracy on the MySpace dataset compared to conventional approaches. Analysis showed that authors’ information can be leveraged to discriminate between harassing posts and the bullying ones. The main limitation of our experiment was the limited size of the dataset. A larger and more diverse dataset should be developed for future work in cyberbullying detection. Other features that may differentiate writing styles, such as profession and educational level, can also be investigated. In future stages this approach will be extended by considering the behaviour of actors across social networks and how they react to a potential cyberbullying incident. A second line of future research will address the various use scenarios for the detection of bullying, as well as the corresponding detection approaches that may be required in each of the different types of cyber contexts.

    Development of a speech recognition system for Spanish broadcast news

    This paper reports on the development process of a speech recognition system for Spanish broadcast news within the MESH FP6 project. The system uses the SONIC recognizer developed at the Center for Spoken Language Research (CSLR), University of Colorado. Acoustic and language models were trained using Hub4 broadcast news data. Experiments and evaluation results are reported.

    Numerals as Determiners

    This paper seeks to explain the distributional facts of the prenominal structure of Dutch NPs. The discussion is based on the categorial status of various types of numerals. In English as well as in Dutch, the internal structure of NPs is quite rigid. The position occupied by the head of the phrase and the relative positions of the other elements within the NP are severely restricted. Their linear order is strict, and there are very few possibilities of movement into or out of the NP. In the absence of syntactic or semantic restrictions, this paper proposes to adopt an alternative categorial status for numerals that derives support from an analysis of the role of the determiner.

    Exploration of audiovisual heritage using audio indexing technology

    This paper discusses audio indexing tools that have been implemented for the disclosure of Dutch audiovisual cultural heritage collections. It explains the role of language models and their adaptation to historical settings, as well as the adaptation of acoustic models for homogeneous audio collections. In addition to the benefits of cross-media linking, it reviews the requirements for successfully tuning and improving the available tools for indexing the heterogeneous A/V collections of the cultural heritage domain. Finally, the paper argues that research is needed to cope with the varying information needs of different types of users.

    Twenty-One: a baseline for multilingual multimedia retrieval

    Automated speech and audio analysis for semantic access to multimedia

    The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content and, as a consequence, improve the effectiveness of conceptual access tools. This paper gives an overview of the various ways in which automatic speech and audio analysis can contribute to increased granularity of automatically extracted metadata. A number of techniques are presented, including the alignment of speech and text resources, large vocabulary speech recognition, keyword spotting, and speaker classification. The applicability of these techniques is discussed from a media crossing perspective. Their added value and potential contribution to the content value chain are illustrated by the description of two complementary demonstrators for browsing broadcast news archives.

    Unravelling the voice of Willem Frederik Hermans: an oral history indexing case study
